A Frame Pruning Approach for Paralinguistic Recognition Tasks
Abstract
In conventional paralinguistic classification approaches, information gained from low-level features is described over broad segments (such as whole turns) via statistical functionals. This procedure presumes that meaningful information is embodied within the whole segment. This assumption may be misleading if distinctive cues within a sample are surrounded by non-meaningful information or noise. In this case it would be beneficial to keep only the parts of the sample that are most relevant for the recognition task. In this paper we propose a novel cluster-based approach that aims at identifying frames likely to carry distinctive information. Evaluation is done within the INTERSPEECH 2012 Speaker Trait Challenge. Results show that under certain configurations frame pruning does lead to an improvement in recognition accuracy. On the observed corpus, the most stable improvements were achieved at a frame drop of 4-8%.
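The general idea of pruning frames before computing segment-level functionals can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe how the clusters are built or how relevance is derived from them, so the per-frame `scores` below are a hypothetical stand-in for that step, and the function names `functionals` and `prune_frames` are introduced here purely for illustration. Frames are reduced to single scalar values to keep the example short.

```python
import statistics


def functionals(frames):
    """Summarize per-frame scalar values with a few statistical functionals."""
    return {
        "mean": statistics.mean(frames),
        "stdev": statistics.pstdev(frames),
        "min": min(frames),
        "max": max(frames),
    }


def prune_frames(frames, scores, drop_fraction):
    """Drop the lowest-scoring fraction of frames, keeping temporal order.

    `scores` is a hypothetical per-frame relevance value (higher means more
    relevant). How relevance is obtained from clusters is an assumption left
    open here, since the abstract does not specify it.
    """
    if len(frames) != len(scores):
        raise ValueError("frames and scores must have the same length")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")

    # Number of frames to discard, rounded down.
    n_drop = int(len(frames) * drop_fraction)

    # Indices of the n_drop least relevant frames.
    by_relevance = sorted(range(len(frames)), key=lambda i: scores[i])
    dropped = set(by_relevance[:n_drop])

    return [f for i, f in enumerate(frames) if i not in dropped]


if __name__ == "__main__":
    # Toy segment: eight frames with made-up values and relevance scores.
    frames = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    scores = [0.9, 0.1, 0.8, 0.7, 0.2, 0.6, 0.5, 0.4]

    print(functionals(frames))
    print(functionals(prune_frames(frames, scores, 0.25)))
```

The drop fraction is the only tuning parameter in this sketch; the abstract reports the most stable improvements at a frame drop of 4-8%, whereas the toy call above uses a larger value only so that the effect is visible on eight frames.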
Similar resources
A confidence-guided dynamic pruning approach - utilization of confidence measurement in speech recognition
Improved efficiency of pruning accelerates the search process and leads to a more time efficient speech recognition system. The goal of this work was to develop a new pruning technique which optimizes the well known probability-based pruning (beam width) by utilization of confidence measurement. We use normalized hypotheses scores to guide the beam width of the pruning process dynamically frame...
Progressive Neural Networks for Transfer Learning in Emotion Recognition
Many paralinguistic tasks are closely related and thus representations learned in one domain can be leveraged for another. In this paper, we investigate how knowledge can be transferred between three paralinguistic tasks: speaker, emotion, and gender recognition. Further, we extend this problem to cross-dataset tasks, asking how knowledge captured in one emotion dataset can be transferred to an...
Frame pruning for speaker recognition
In this paper, we propose a frame selection procedure for text-independent speaker identification. Instead of averaging the frame likelihoods along the whole test utterance, some of these are rejected (pruning) and the final score is computed with a limited number of frames. This pruning stage requires a prior frame level likelihood normalization in order to make comparison between frames meanin...
Laughter Classification Using Deep Rectifier Neural Networks with a Minimal Feature Subset
Laughter is one of the most important paralinguistic events, and it has specific roles in human conversation. The automatic detection of laughter occurrences in human speech can aid automatic speech recognition systems as well as some paralinguistic tasks such as emotion detection. In this study we apply Deep Neural Networks (DNN) for laughter detection, as this technology is nowadays considere...
Recognition on Finite Set of Events: Bayesian Analysis of Generalization Ability and Classification Tree Pruning
The problem of recognition on finite set of events is considered. The generalization ability of classifiers for this problem is studied within the Bayesian approach. The method for non-uniform prior distribution specification on recognition tasks is suggested. It takes into account the assumed degree of intersection between classes. The results of the analysis are applied for pruning of classif...